Evaluating Language Models for Knowledge Base Completion
Authors
Abstract
Structured knowledge bases (KBs) are a foundation of many intelligent applications, yet are notoriously incomplete. Language models (LMs) have recently been proposed for unsupervised knowledge base completion (KBC), yet, despite encouraging initial results, questions regarding their suitability remain open. Existing evaluations often fall short because they only evaluate on popular subjects, or sample already existing facts from KBs. In this work, we introduce a novel, more challenging benchmark dataset, and a methodology tailored for a realistic assessment of the KBC potential of LMs. For automated assessment, we curate a dataset called WD-Known, which provides an unbiased random sample of Wikidata, containing over 3.9 million facts. In a second step, we perform a human evaluation on predictions that are not yet in the KB, as only this provides real insights into the added value over existing KBs. Our key finding is that biases in the conception of previous benchmarks lead to a systematic overestimate of LM performance for KBC. However, our results also reveal strong areas of LMs. We could, for example, achieve a significant completion of the Wikidata relation nativeLanguage, by a factor of ~21 (from 260k to 5.8M facts) at 82% precision, and of citizenOf by a factor of ~0.3 (from 4.2M to 5.3M facts) at 90% precision. Moreover, we find that LMs possess surprisingly strong generalization capabilities: even for relations where most facts were not directly observed in LM training, prediction quality can be high. We open-source our code. ( https://github.com/bveseli/LMsForKBC ).
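The completion gains reported in the abstract can be reproduced arithmetically. The following is a minimal sketch, assuming a simple set-based bookkeeping of relation sizes and a human-evaluated prediction sample; the function names and sample numbers are illustrative, not the paper's exact evaluation code.

```python
# Sketch: quantifying KB-completion gain and precision for one relation.
# The figures mirror the abstract's nativeLanguage example: 260k existing
# facts growing to ~5.8M after adding high-confidence LM predictions.

def completion_factor(existing: int, after: int) -> float:
    """By what factor the relation grew, e.g. (5.8M - 260k) / 260k ~ 21."""
    return (after - existing) / existing

def precision(correct: int, predicted: int) -> float:
    """Fraction of sampled predictions judged correct by human annotators."""
    return correct / predicted if predicted else 0.0

existing, after = 260_000, 5_800_000
print(f"nativeLanguage completion factor: ~{completion_factor(existing, after):.0f}")
print(f"sample precision: {precision(82, 100):.2f}")
```

This matches the abstract's "factor of ~21 at 82% precision" for nativeLanguage; the same arithmetic gives ~0.3 for citizenOf (4.2M to 5.3M).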
Similar resources
Lations for Knowledge Base Completion
In this work we present a novel approach for the utilization of observed relations between entity pairs in the task of triple argument prediction. The approach is based on representing observations in a shared, continuous vector space of structured relations and text. Results on a recent benchmark dataset demonstrate that the new model is superior to existing sparse feature models. In combinati...
Compositional Vector Space Models for Knowledge Base Completion
Knowledge base (KB) completion adds new facts to a KB by making inferences from existing facts, for example by inferring with high likelihood nationality(X,Y) from bornIn(X,Y). Most previous methods infer simple one-hop relational synonyms like this, or use as evidence a multi-hop relational path treated as an atomic feature, like bornIn(X,Z)→ containedIn(Z,Y). This paper presents an approach t...
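The one-hop and multi-hop rule patterns named in this abstract can be sketched directly. The toy triples and the rule bornIn(X, Z) ∧ containedIn(Z, Y) ⇒ nationality(X, Y) below are illustrative assumptions, not the paper's model, which learns such paths rather than hard-coding them.

```python
# Sketch: inferring nationality(X, Y) from a multi-hop relational path
# bornIn(X, Z) -> containedIn(Z, Y), over a toy KB of (subject, relation,
# object) triples.

kb = {
    ("alice", "bornIn", "paris"),
    ("paris", "containedIn", "france"),
}

def infer_nationality(kb):
    """Apply the hard-coded path rule and return the inferred triples."""
    born = {(s, o) for s, r, o in kb if r == "bornIn"}
    contained = {(s, o) for s, r, o in kb if r == "containedIn"}
    inferred = set()
    for x, z in born:
        for z2, y in contained:
            if z == z2:  # the path bornIn(X, Z) -> containedIn(Z, Y) closes
                inferred.add((x, "nationality", y))
    return inferred

print(infer_nationality(kb))  # {('alice', 'nationality', 'france')}
```

Treating the whole path as one atomic feature, as the abstract notes of prior work, would miss paths unseen at training time; the paper's compositional approach instead composes the relation vectors along the path.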
Commonsense Knowledge Base Completion
We enrich a curated resource of commonsense knowledge by formulating the problem as one of knowledge base completion (KBC). Most work in KBC focuses on knowledge bases like Freebase that relate entities drawn from a fixed set. However, the tuples in ConceptNet (Speer and Havasi, 2012) define relations between an unbounded set of phrases. We develop neural network models for scoring tuples on ar...
Knowledge Base Completion using Compositional Vector Space Models
Traditional approaches to knowledge base completion have been based on symbolic representations. Low-dimensional vector embedding models proposed recently for this task are attractive since they generalize to possibly unlimited sets of relations. A significant drawback of previous embedding models for KB completion is that they merely support reasoning on individual relations (e.g., bornIn(X,Y ...
Neighborhood Mixture Model for Knowledge Base Completion
Knowledge bases are useful resources for many natural language processing tasks, however, they are far from complete. In this paper, we define a novel entity representation as a mixture of its neighborhood in the knowledge base and apply this technique on TransE—a well-known embedding model for knowledge base completion. Experimental results show that the neighborhood information significantly ...
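TransE, the base model this abstract builds on, scores a triple (h, r, t) by the distance ||h + r − t|| between embeddings: lower distance means a more plausible fact. A minimal sketch follows, with made-up 2-d embeddings; the mixture over neighborhood entities that the paper adds on top is not reproduced here.

```python
# Sketch: TransE-style triple scoring with toy 2-d embeddings.
# The embedding values are fabricated for illustration only.
import math

emb = {
    "berlin":    (1.0, 0.0),
    "germany":   (1.5, 1.0),
    "capitalOf": (0.5, 1.0),
}

def transe_score(h: str, r: str, t: str) -> float:
    """Distance ||h + r - t||; 0.0 means the translation fits exactly."""
    hv, rv, tv = emb[h], emb[r], emb[t]
    return math.dist((hv[0] + rv[0], hv[1] + rv[1]), tv)

print(transe_score("berlin", "capitalOf", "germany"))  # 0.0, a perfect fit
```

In the paper's variant, the head and tail vectors would be mixtures of the entity's own embedding and those of its KB neighbors, which is what injects the neighborhood information the abstract describes.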
Journal
Journal title: Lecture Notes in Computer Science
Year: 2023
ISSN: ['1611-3349', '0302-9743']
DOI: https://doi.org/10.1007/978-3-031-33455-9_14